Glossary

Dynamical System In this article: a continuous transformation $T$ of a compact metric space
$X$. For each $x \in X$, the transformation $T$ generates a trajectory $(x, Tx, T^2x, \dots)$.



Invariant measure In this article: a probability measure $\mu$ on $X$ which is invariant under the
transformation $T$, i.e., for which $\langle f \circ T, \mu\rangle = \langle f, \mu\rangle$ for each continuous $f : X \to \mathbb{R}$. Here $\langle f, \mu\rangle$ is
a short-hand notation for $\int_X f\,d\mu$. The triple $(X, T, \mu)$ is called a measure-preserving dynamical
system.

Ergodic theory Ergodic theory is the mathematical theory of measure-preserving dynamical
systems.

Entropy In this article: the maximal rate of information gain per time that can be achieved by
coarse-grained observations on a measure-preserving dynamical system. This quantity is often
denoted $h(\mu)$.

Equilibrium State In general, a given dynamical system $T : X \to X$ admits a huge number of
invariant measures. Given some continuous $\phi : X \to \mathbb{R}$ ("potential"), those invariant measures
which maximise a functional of the form $F(\mu) = h(\mu) + \langle\phi, \mu\rangle$ are called "equilibrium states" for
$\phi$.

Pressure The maximum of the functional $F(\mu)$ is denoted by $P(\phi)$ and called the "topological
pressure" of $\phi$, or simply the "pressure" of $\phi$.

Gibbs State In many cases, equilibrium states have a local structure that is determined by the
local properties of the potential $\phi$. They are called "Gibbs states".

Sinai-Ruelle-Bowen measure Special equilibrium or Gibbs states that describe the statistics
of the attractor of certain smooth dynamical systems. 

1. Definition of the Subject and Its Importance 

Gibbs and equilibrium states of one-dimensional lattice models in statistical physics play 
a prominent role in the statistical theory of chaotic dynamics. They first appear in the 
ergodic theory of certain differentiable dynamical systems, called "uniformly hyperbolic 
systems", mainly Anosov and Axiom A diffeomorphisms (and flows). The central idea 
is to "code" the orbits of these systems into (infinite) sequences of symbols by
following their history on a finite partition of their phase space. This defines a nice shift 
dynamical system called a subshift of finite type or a topological Markov chain. Then the 
construction of their "natural" invariant measures and the study of their properties are 
carried out at the symbolic level by constructing certain equilibrium states in the sense 
of statistical mechanics which turn out to be also Gibbs states. The study of uniformly 
hyperbolic systems brought out several ideas and techniques which turned out to be 
extremely fruitful for the study of more general systems. Let us mention the concept 
of Markov partition and its avatars, the very important notion of SRB measure (after 
Sinai, Ruelle and Bowen) and transfer operators. Recently, there has been a revival of interest in
Axiom A systems as models for understanding nonequilibrium statistical mechanics.

2. Introduction 

Our goal is to present the basic results on one-dimensional Gibbs and equilibrium states
viewed as special invariant measures on symbolic dynamical systems, and then to describe,
without technicalities, a sample of the results they have made it possible to obtain for certain
differentiable dynamical systems. We hope that this contribution will illustrate the symbiotic
relationship between ergodic theory and statistical mechanics, and also information theory.

We start by putting Gibbs and equilibrium states in a general perspective. The theory 
of Gibbs states and equilibrium states, or Thermodynamic Formalism, is a branch of 
rigorous Statistical Physics. The notion of a Gibbs state dates back to R.L. Dobrushin
(1968-1969) and O.E. Lanford and D. Ruelle (1969), who proposed it
as a mathematical idealisation of an equilibrium state of a physical system which consists 
of a very large number of interacting components. For a finite number of components, 
the foundations of statistical mechanics were already laid in the nineteenth century. 
There was the well-known Maxwell-Boltzmann-Gibbs formula for the equilibrium distri- 
bution of a physical system with given energy function. From the mathematical point of 
view, the intrinsic properties of very large objects can be made manifest by performing 
suitable limiting procedures. Indeed, the crucial step made in the 1960's was to define 
the notion of a Gibbs measure or Gibbs state for a system with an infinite number of 
interacting components. This was done by the familiar probabilistic idea of specifying 
the interdependence structure by means of a suitable class of conditional probabilities 
built up according to the Maxwell-Boltzmann-Gibbs formula [29]. Notice that Gibbs
states are often called "DLR states" in honour of Dobrushin, Lanford and Ruelle. The
remarkable aspect of this construction is the fact that a Gibbs state for a given type of 
interaction may fail to be unique. In physical terms, this means that a system with this 
interaction can take several distinct equilibria. The phenomenon of nonuniqueness of a 
Gibbs measure can thus be interpreted as a phase transition. Therefore, the conditions
under which an interaction leads to a unique or to several Gibbs measures turn out to
be of central importance. While Gibbs states are defined locally by specifying certain
conditional probabilities, equilibrium states are defined globally by a variational princi- 
ple: they maximise the entropy of the system under the (linear) constraint that the mean 
energy is fixed. Gibbs states are always equilibrium states, but the two notions do not 
coincide in general. However, for a class of sufficiently regular interactions, equilibrium 
states are also Gibbs states. 

In an effort to understand phase transitions, simplified mathematical models
were proposed, the most famous one being undoubtedly the Ising model. This is an
example of a lattice model. The set of configurations of a lattice model is $X := A^{\mathbb{Z}^d}$,
where $A$ is a finite set; $X$ is invariant under "spatial" translations. For the physical
interpretation, $X$ can be thought of, for instance, as the set of infinite configurations of a
system of spins on a crystal lattice $\mathbb{Z}^d$, and $A$ may be taken as $\{+1, -1\}$, i.e., spins can
take two orientations, "up" and "down". The Ising model is defined by specifying an
interaction (or potential) between spins and then studying the corresponding
(translation-invariant) Gibbs states. The striking phenomenon is that for $d = 1$ there is a unique
Gibbs state (in fact a Markov measure) whereas if $d \geq 2$, there may be several Gibbs
states although the interaction is very simple [29].



Equilibrium states and Gibbs states of one-dimensional lattice models (d = 1) played a 
prominent role in understanding the ergodic properties of certain types of differentiable 
dynamical systems, namely uniformly hyperbolic systems, Axiom A diffeomorphisms in 
particular. The link between one-dimensional lattice systems and dynamical systems 
is made by symbolic dynamics. Informally, symbolic dynamics consists in replacing
each orbit of the original system by its history on a finite partition of the phase space
labelled by the elements of the "alphabet" $A$. Therefore, each orbit of the original
system is replaced by an infinite sequence of symbols, i.e., by an element of the set
$A^{\mathbb{Z}}$ or $A^{\mathbb{N}}$, depending on whether the map describing the dynamics is invertible or
not. The action of the map on an initial condition is then easily seen to correspond
to the translation (or shift) of its associated symbolic sequence. In general there is no
reason to get all sequences of $A^{\mathbb{Z}}$ or $A^{\mathbb{N}}$. Instead one gets a closed invariant subset $X$
(a subshift) which can be very complicated. For a certain class of dynamical systems 
the partition can be successfully chosen so as to form a Markov partition. In this case, 
the dynamical system under consideration can be coded by a subshift of finite type (also 
called a topological Markov chain) which is a very nice symbolic dynamical system. Then 
one can play the game of statistical physics: for a given continuous, real-valued function
(a "potential") on X, construct the corresponding Gibbs states and equilibrium states. 
If the potential is regular enough, one expects uniqueness of the Gibbs state and that it 
is also the unique equilibrium state for this potential. This circle of ideas - ranging from 
Gibbs states on finite systems over invariant measures on symbolic systems and their 
(Shannon-)entropy with a digression to Kolmogorov-Chaitin complexity to equilibrium
states and Gibbs states on subshifts of finite type - is presented in Sections [3] -

At this point it should be remembered that the objects which can actually be observed 
are not equilibrium states (they are measures on X) but individual symbol sequences in 
X, which reflect more or less the statistical properties of an equilibrium state. Indeed, 
most sequences reflect these properties very well, but there are also rare sequences that 
look quite different. Their properties are described by large deviations principles which 
are not discussed in the present article. We shall indicate some references along the 
way. 

In Sections [7] and [8] we present a selection of important examples: measure of maximal 
entropy, Markov measures and Hofbauer's example of nonuniqueness of equilibrium state; 
uniformly expanding Markov maps of the interval, interval maps with an indifferent 
fixed point, Anosov diffeomorphisms and Axiom A attractors with Sinai-Ruelle-Bowen 
measures, and Bowen's formula for the Hausdorff dimension of conformal repellers. As
we shall see, Sinai-Ruelle-Bowen measures are the only physically observable measures
and they appear naturally in the context of nonuniformly hyperbolic diffeomorphisms.

A revival of interest in Anosov and Axiom A systems occurred in statistical mechanics
in the 1990's. Several physical phenomena of nonequilibrium origin, like entropy
production and chaotic scattering, were modelled with the help of those systems (by
G. Gallavotti, P. Gaspard, D. Ruelle, and others). This new interest led to new results
about old Anosov and Axiom A systems, see, e.g., [15] for a survey and references. In
Section 9 we give a very brief account of entropy production in the context of Anosov
systems which highlights the role of relative entropy.

This article is a short introduction to a vast subject in which we have tried to put
forward some aspects not previously described in other expository texts. For readers
wishing to deepen their understanding of equilibrium and Gibbs states, there are the
classic monographs by Bowen [5] and by Ruelle [58], the monograph by one of us [38],
and the survey article by Chernov [15] (where Anosov and Axiom A flows are reviewed).
Those texts are really complementary.



3. Warming Up: Thermodynamic Formalism for Finite Systems 

We introduce the thermodynamic formalism in an elementary context, following Jaynes
[34]. In this view, entropy, in the sense of information theory, is the central concept.

Incomplete knowledge about a system is conveniently described in terms of probability
distributions on the set of its possible states. This is particularly simple if the set of
states, call it $X$, is finite. Then the equidistribution on $X$ describes complete lack of
knowledge, whereas a probability vector that assigns probability 1 to one single state and
probability 0 to all others represents maximal information about the system. A well-established
measure of the amount of uncertainty represented by a probability distribution
$\nu = (\nu(x))_{x \in X}$ is its entropy

$H(\nu) := -\sum_{x \in X} \nu(x) \log \nu(x)$,

which is zero if the probability is concentrated in one state and which attains its maximum
value $\log|X|$ if $\nu$ is the equidistribution on $X$, i.e., if $\nu(x) = |X|^{-1}$ for all $x \in X$. In
this completely elementary context we will explore two concepts whose generalisations
are central to the theory of equilibrium states in ergodic theory:

• equilibrium distributions - defined in terms of a variational problem,

• the Gibbs property of equilibrium distributions.

The only mathematical prerequisites for this section are calculus and some elements of
probability theory.
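The two extreme cases of the entropy just described can be checked numerically. The following is a minimal sketch; the function name `entropy` and the three sample distributions are illustrative choices, not part of the original text.

```python
import math

def entropy(nu):
    """Shannon entropy H(nu) = -sum nu(x) log nu(x), with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in nu if p > 0.0)

# Three illustrative distributions on a four-point state space X.
point_mass = [1.0, 0.0, 0.0, 0.0]   # maximal information
uniform = [0.25, 0.25, 0.25, 0.25]  # complete lack of knowledge
skewed = [0.7, 0.1, 0.1, 0.1]       # something in between

print(entropy(point_mass))            # zero for a point mass
print(entropy(uniform), math.log(4))  # both equal log |X|
print(entropy(skewed))                # strictly between the two extremes
```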

3.1. Equilibrium Distributions and the Gibbs Property. Suppose that a finite
system can be observed through a function $U : X \to \mathbb{R}$ (an "observable"), and that we are
looking for a probability distribution $\mu$ which maximises entropy among all distributions
$\nu$ with a prescribed expected value $\langle U, \nu\rangle := \sum_{x \in X} \nu(x) U(x)$ for the observable $U$. This
means we have to solve a variational problem under constraints:

(1)   $H(\mu) = \max\{H(\nu) : \langle U, \nu\rangle = E\}$

As the function $\nu \mapsto H(\nu)$ is strictly concave, there is a unique maximising probability
distribution $\mu$ provided the value $E$ can be attained at all by some $\langle U, \nu\rangle$. In order to
derive an explicit formula for this $\mu$ we introduce a Lagrange multiplier $\beta \in \mathbb{R}$ and study,
for each $\beta$, the unconstrained problem

(2)   $H(\mu_\beta) + \langle\beta U, \mu_\beta\rangle = p(\beta U) := \max_\nu \big(H(\nu) + \langle\beta U, \nu\rangle\big)$

In analogy to the convention in ergodic theory we call $p(\beta U)$ the pressure of $\beta U$ and
the maximiser $\mu_\beta$ the corresponding equilibrium distribution (synonymously equilibrium
state).

The equilibrium distribution $\mu_\beta$ satisfies

(3)   $\mu_\beta(x) = \exp(-p(\beta U) + \beta U(x))$ for all $x \in X$

as an elementary calculation using Jensen's inequality for the strictly convex function
$t \mapsto -\log t$ shows:

$H(\nu) + \langle\beta U, \nu\rangle = \sum_{x \in X} \nu(x) \log\frac{e^{\beta U(x)}}{\nu(x)} \leq \log \sum_{x \in X} \nu(x) \frac{e^{\beta U(x)}}{\nu(x)} = \log \sum_{x \in X} e^{\beta U(x)}$,

with equality if and only if $e^{\beta U}$ is a constant multiple of $\nu$. The observation that $\nu = \mu_\beta$
is a maximiser proves at the same time that $p(\beta U) = \log \sum_{x \in X} e^{\beta U(x)}$.

The equality expressed in (3) is called the Gibbs property of $\mu_\beta$, and we say that $\mu_\beta$ is
a Gibbs distribution if we want to stress this property.
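The Gibbs property and the variational characterisation can be compared on a small example. The sketch below assumes an arbitrary four-state observable `U` and multiplier `beta` (both hypothetical values chosen for illustration); it builds the Gibbs distribution from (3), checks that it attains the value $\log \sum_x e^{\beta U(x)}$, and checks that randomly drawn distributions never exceed it.

```python
import math
import random

def entropy(nu):
    """Shannon entropy with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in nu if p > 0.0)

def gibbs(U, beta):
    """Return the Gibbs distribution mu_beta and the pressure p(beta U) = log sum exp(beta U)."""
    weights = [math.exp(beta * u) for u in U]
    Z = sum(weights)
    return [w / Z for w in weights], math.log(Z)

def functional(nu, U, beta):
    """The quantity H(nu) + <beta U, nu> that is maximised in (2)."""
    return entropy(nu) + beta * sum(p * u for p, u in zip(nu, U))

U = [0.0, 1.0, 2.5, -1.0]   # illustrative observable on a 4-point space
beta = 0.8                  # illustrative Lagrange multiplier
mu_beta, pressure = gibbs(U, beta)

rng = random.Random(0)

def random_distribution(size):
    """A random probability vector, used only as a competitor in the variational problem."""
    w = [rng.random() for _ in range(size)]
    total = sum(w)
    return [x / total for x in w]

best_random = max(functional(random_distribution(len(U)), U, beta) for _ in range(2000))
print(functional(mu_beta, U, beta), pressure, best_random)
```

Random sampling cannot prove maximality; it only illustrates that no sampled competitor beats the Gibbs distribution.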

In order to solve the constrained problem (1) it remains to show that there is a unique
multiplier $\beta = \beta(E)$ such that $\langle U, \mu_\beta\rangle = E$. This follows from the fact that the map
$\beta \mapsto \langle U, \mu_\beta\rangle$ maps the real line monotonically onto the interval $(\min U, \max U)$ which, in
turn, is a direct consequence of the formulas for the first and second derivative of $p(\beta U)$
w.r.t. $\beta$:

(4)   $\frac{dp}{d\beta} = \langle U, \mu_\beta\rangle, \qquad \frac{d^2p}{d\beta^2} = \langle U^2, \mu_\beta\rangle - \langle U, \mu_\beta\rangle^2$

As the second derivative is nothing but the variance of $U$ under $\mu_\beta$, it is strictly positive
(except when $U$ is a constant function), so that $\beta \mapsto \langle U, \mu_\beta\rangle$ is indeed strictly increasing.
Observe also that $\frac{dp}{d\beta}$ is indeed the directional derivative of $p : \mathbb{R}^X \to \mathbb{R}$ in direction
$U$. Hence the first identity in (4) can be rephrased as: $\mu_\beta$ is the gradient at $\beta U$ of the
function $p$.
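The derivative formulas and the monotonicity argument translate directly into a numerical procedure for finding $\beta(E)$. The following sketch assumes a small illustrative observable; the helper names, the bisection bracket $[-50, 50]$ and the finite-difference step are choices made here for illustration only.

```python
import math

def pressure(U, beta):
    """p(beta U) = log sum_x exp(beta U(x))."""
    return math.log(sum(math.exp(beta * u) for u in U))

def mean_and_variance(U, beta):
    """Mean <U, mu_beta> and variance <U^2, mu_beta> - <U, mu_beta>^2 under the Gibbs distribution."""
    weights = [math.exp(beta * u) for u in U]
    Z = sum(weights)
    mean = sum(w * u for w, u in zip(weights, U)) / Z
    second = sum(w * u * u for w, u in zip(weights, U)) / Z
    return mean, second - mean ** 2

def solve_beta(U, E, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for beta(E); valid because beta -> <U, mu_beta> is strictly increasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_and_variance(U, mid)[0] < E:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

U = [0.0, 1.0, 2.5, -1.0]   # illustrative observable
beta, h = 0.8, 1e-4
mean, var = mean_and_variance(U, beta)

# Central finite differences approximating dp/dbeta and d^2p/dbeta^2.
d1 = (pressure(U, beta + h) - pressure(U, beta - h)) / (2 * h)
d2 = (pressure(U, beta + h) - 2 * pressure(U, beta) + pressure(U, beta - h)) / h ** 2

beta_E = solve_beta(U, E=1.0)
print(d1, mean, d2, var, beta_E)
```

Bisection is used here only because it relies on nothing beyond the monotonicity established above; any root finder would do.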

A similar analysis can be performed for an $\mathbb{R}^d$-valued observable $\phi$. In that case a vector
$\beta \in \mathbb{R}^d$ of Lagrange multipliers is needed to satisfy the $d$ linear constraints.

3.2. Systems on a Finite Lattice. We now assume that the system has a lattice
structure, modelling its extension in space, for instance. The system can be in different
states at different positions. More specifically, let $L_n = \{0, 1, \dots, n-1\}$ be a set of $n$
positions in space, let $A$ be a finite set of states that can be attained by the system at
each of its sites, and denote by $X := A^n$ the set of all configurations of states from $A$
at positions of $L_n$. It is helpful to think of $X$ as the set of all words of length $n$ over
the alphabet $A$. We focus on observables $U_n$ which are sums of many local contributions
in the sense that $U_n(a_0 \dots a_{n-1}) = \sum_{i=0}^{n-1} \phi(a_i \dots a_{i+r-1})$ for some "local observable"
$\phi : A^r \to \mathbb{R}$. (The indices $i, \dots, i+r-1$ have to be taken modulo $n$.) In terms of $\phi$ the
maximising measure can be written as

(5)   $\mu_\beta(a_0 \dots a_{n-1}) = \exp\Big(-nP(\beta\phi) + \beta \sum_{i=0}^{n-1} \phi(a_i \dots a_{i+r-1})\Big)$

where $P(\beta\phi) := n^{-1} p(\beta U_n)$. A first immediate consequence of (5) is the invariance
of $\mu_\beta$ under a cyclic shift of its argument, namely $\mu_\beta(a_1 \dots a_{n-1}a_0) = \mu_\beta(a_0 \dots a_{n-1})$.
Therefore we can restrict the maximisations in (1) and (2) to probability distributions
$\nu$ which are invariant under cyclic translations, which yields

(6)   $P(\beta\phi) = \max_\nu \big(n^{-1}H(\nu) + \langle\beta\phi, \nu\rangle\big) = n^{-1}H(\mu_\beta) + \langle\beta\phi, \mu_\beta\rangle$.

If the local observable $\phi$ depends only on one coordinate, $\mu_\beta$ turns out to be a product
measure:

$\mu_\beta(a_0 \dots a_{n-1}) = \prod_{i=0}^{n-1} \exp\big(-P(\beta\phi) + \beta\phi(a_i)\big)$.

Indeed, comparison with (3) shows that $\mu_\beta$ is the $n$-fold product of the probability
distribution $\mu_\beta^{loc}$ on $A$ that maximises $H(\nu) + \langle\beta\phi, \nu\rangle$ among all distributions $\nu$ on $A$. It
follows that $n^{-1}H(\mu_\beta) = H(\mu_\beta^{loc})$ so that (6) implies $P(\beta\phi) = p(\beta\phi)$ for observables $\phi$
that depend only on one coordinate.
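For a one-coordinate observable the identity $P(\beta\phi) = p(\beta\phi)$ can be confirmed by brute force on a short lattice. The sketch below assumes a three-letter alphabet, a hypothetical local observable `phi` and lattice size `n = 5`; it enumerates all of $A^n$, which is only feasible for small $n$.

```python
import itertools
import math

A = (0, 1, 2)                     # illustrative alphabet
phi = {0: 0.0, 1: 0.5, 2: -1.0}   # illustrative one-coordinate local observable
beta, n = 1.3, 5

def U_n(word):
    """Global observable: sum of the local contributions phi(a_i)."""
    return sum(phi[a] for a in word)

# Pressure of the global observable on A^n, and the per-site pressure P = p(beta U_n) / n.
p_global = math.log(sum(math.exp(beta * U_n(w)) for w in itertools.product(A, repeat=n)))
P = p_global / n

# Pressure of the single-site problem on A.
p_local = math.log(sum(math.exp(beta * phi[a]) for a in A))

def mu_product(word):
    """Product form of the equilibrium distribution for a one-coordinate observable."""
    return math.prod(math.exp(-P + beta * phi[a]) for a in word)

def mu_gibbs(word):
    """Gibbs form (5) of the same distribution."""
    return math.exp(-n * P + beta * U_n(word))

total_mass = sum(mu_product(w) for w in itertools.product(A, repeat=n))
print(P, p_local, total_mass)
```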
4. Shift spaces, Invariant Measures and Entropy

We now turn to shift dynamical systems over a finite alphabet A. 

4.1. Symbolic Dynamics. We start by fixing some notation. Let $\mathbb{N}$ denote the set
$\{0, 1, 2, \dots\}$. In the sequel we need

- a finite set $A$ (the "alphabet"),

- the set $A^{\mathbb{N}}$ of all infinite sequences over $A$, i.e., the set of all $x = x_0x_1\dots$ with
$x_n \in A$ for all $n \in \mathbb{N}$,

- the translation (or shift) $\sigma : A^{\mathbb{N}} \to A^{\mathbb{N}}$, $(\sigma x)_n = x_{n+1}$ for all $n \in \mathbb{N}$,

- a shift invariant subset $X = \sigma(X)$ of $A^{\mathbb{N}}$. With a slight abuse of notation we
denote the restriction of $\sigma$ to $X$ by $\sigma$ again.

We mention two interpretations of the dynamics of $\sigma$: it can describe the evolution of a
system with state space $X$ in discrete time steps (this is the prevalent interpretation if
$\sigma : X \to X$ is obtained as a symbolic representation of another dynamical system), or
it can be the spatial translation of the configuration of a system on an infinite lattice
(generalising the point of view from Subsection 3.2). In the latter case one usually looks
at the shift on the two-sided shift space $A^{\mathbb{Z}}$, for which the theory is nearly identical.

On $A^{\mathbb{N}}$ one can define a metric $d$ by

(7)   $d(x, y) := 2^{-N(x,y)}$ where $N(x, y) := \min\{k \in \mathbb{N} : x_k \neq y_k\}$.

Hence $d(x, y) = 1$ if and only if $x_0 \neq y_0$, and $d(x, x) = 0$ upon agreeing that $N(x, x) = \infty$
and $2^{-\infty} = 0$. Equipped with this metric, $A^{\mathbb{N}}$ becomes a compact metric space and $\sigma$ is
easily seen to be a continuous surjection of $A^{\mathbb{N}}$. Finally, if $X$ is a closed subset of $A^{\mathbb{N}}$, we
call the restriction $\sigma : X \to X$, which is again a continuous surjection, a shift dynamical
system. We remark that $d$ generates on $A^{\mathbb{N}}$ the product topology of the discrete topology
on $A$, just as many variants of $d$ do. For more details see [[Marcus]]. As usual, $C(X)$
denotes the space of real-valued continuous functions on $X$ equipped with the supremum
norm $\|\cdot\|_\infty$.
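The metric (7) and the shift are easy to experiment with on finite prefixes of sequences. The sketch below is an illustration under the assumption that sequences are represented by equal-length strings; a returned distance of 0 then only means that the two prefixes agree, i.e., that the true distance is at most $2^{-\text{length}}$.

```python
def first_difference(x, y):
    """N(x, y): index of the first coordinate where x and y differ (None if the prefixes agree)."""
    for k, (a, b) in enumerate(zip(x, y)):
        if a != b:
            return k
    return None

def d(x, y):
    """The metric d(x, y) = 2^(-N(x, y)), evaluated on finite prefixes."""
    N = first_difference(x, y)
    return 0.0 if N is None else 2.0 ** (-N)

def shift(x):
    """The shift sigma: drop the first symbol."""
    return x[1:]

x, y, z = "0110100", "0111100", "1110100"

print(d(x, z))                       # 1.0, because x_0 != z_0
print(d(x, y))                       # 0.125, first difference at index 3
print(d(shift(x), shift(y)))         # 0.25, the shift at most doubles distances
print(d(x, z) <= max(d(x, y), d(y, z)))  # ultrametric inequality holds here
```

The bound $d(\sigma x, \sigma y) \leq 2\,d(x, y)$ visible in the third line is one way to see the continuity of $\sigma$.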

4.2. Invariant Measures. A probability distribution $\nu$ (or simply distribution) on
$X$ is a Borel probability measure on $X$. It is unambiguously specified by its values
$\nu[a_0 \dots a_{n-1}]$ ($n \in \mathbb{N}$, $a_i \in A$) on cylinder sets

$[a_0 \dots a_{n-1}] := \{x \in X : x_i = a_i \text{ for all } i = 0, \dots, n-1\}$.

Any bounded and measurable $f : X \to \mathbb{R}$ (in particular any $f \in C(X)$) can be integrated
by any distribution $\nu$. To stress the linearity of the integral in both the integrand and
the integrator, we use the notation

$\langle f, \nu\rangle := \int_X f\,d\nu$.

In probabilistic terms, $\langle f, \nu\rangle$ is the expectation of the observable $f$ under $\nu$. The set
$\mathcal{M}(X)$ of all probability distributions is compact in the weak topology, the coarsest
topology on $\mathcal{M}(X)$ for which $\nu \mapsto \langle f, \nu\rangle$ is continuous for all $f \in C(X)$, see [[Petersen,
6.1]]. (Note that in functional analysis this is called the weak-* topology.) Henceforth
we will use both terms, "measure" and "distribution", if we talk about probability
distributions.

A measure $\nu$ on $X$ is invariant if expectations of observables are unchanged under the
shift, i.e., if

$\langle f \circ \sigma, \nu\rangle = \langle f, \nu\rangle$ for all bounded measurable $f : X \to \mathbb{R}$.

The set of all invariant measures is denoted by $\mathcal{M}_\sigma(X)$. As a closed subset of $\mathcal{M}(X)$ it
is compact in the weak topology. Of special importance among all invariant measures $\nu$
are the ergodic ones which can be characterised by the property that, for all bounded
measurable $f : X \to \mathbb{R}$,

(8)   $\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(\sigma^k x) = \langle f, \nu\rangle$ for $\nu$-a.e. (almost every) $x$,

i.e., for a set of $x$ of $\nu$-measure one. They are the indecomposable "building blocks" of
all other measures in $\mathcal{M}_\sigma(X)$, see [[Petersen, 6.2]] or [[del Junco]]. The almost everywhere
convergence in (8) is Birkhoff's ergodic theorem [[del Junco]]; the constant limit
characterises the ergodicity of $\nu$.

4.3. Entropy of Invariant Measures. We give a brief account of the definition and
basic properties of the entropy of the shift under an invariant measure $\nu$. For details
and the generalisation of this concept to general dynamical systems we refer to [[King]]
or [37], and to [36] for an historical account.

Let $\nu \in \mathcal{M}_\sigma(X)$. For each $n > 0$ the cylinder probabilities $\nu[a_0 \dots a_{n-1}]$ give rise to a
probability distribution on the finite set $A^n$, see Section 3, so

$H_n(\nu) := -\sum_{a_0, \dots, a_{n-1} \in A} \nu[a_0 \dots a_{n-1}] \log \nu[a_0 \dots a_{n-1}]$

is well defined. Invariance of $\nu$ guarantees that the sequence $(H_n(\nu))_{n > 0}$ is subadditive,
i.e., $H_{k+n}(\nu) \leq H_k(\nu) + H_n(\nu)$, and an elementary argument shows that the limit

(9)   $h(\nu) := \lim_{n \to \infty} \frac{1}{n} H_n(\nu) \in [0, \log|A|]$

exists and equals the infimum of the sequence. We simply call it the entropy of $\nu$. (Note
that for general subshifts $X$ many of the cylinder sets $[a_0 \dots a_{n-1}] \subseteq X$ are empty. But,
because of the continuity of the function $t \mapsto t \log t$ at $t = 0$, we set $0 \log 0 = 0$, and,
hence, this does not affect the definition of $H_n(\nu)$.)
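The convergence of $\frac{1}{n}H_n(\nu)$ can be observed on a concrete invariant measure. The sketch below uses a stationary two-state Markov measure as an assumed example (the transition matrix and its stationary vector are illustrative choices); for such measures it is a standard fact that $h(\nu) = -\sum_{a,b} \pi_a P_{ab} \log P_{ab}$, which the code uses as the reference value.

```python
import itertools
import math

transition = [[0.9, 0.1], [0.4, 0.6]]   # illustrative transition probabilities
pi = [0.8, 0.2]                         # stationary vector of this matrix

def cylinder(word):
    """nu[a_0 ... a_{n-1}] for the stationary Markov measure."""
    prob = pi[word[0]]
    for a, b in zip(word, word[1:]):
        prob *= transition[a][b]
    return prob

def H(n):
    """Block entropy H_n(nu), by enumeration of all words of length n."""
    probs = (cylinder(w) for w in itertools.product((0, 1), repeat=n))
    return -sum(q * math.log(q) for q in probs if q > 0.0)

# Known closed form for the entropy of a stationary Markov measure.
h = -sum(pi[a] * transition[a][b] * math.log(transition[a][b])
         for a in (0, 1) for b in (0, 1))

rates = [H(n) / n for n in range(1, 11)]
print(rates)
print(h)
```

In this example the sequence $\frac{1}{n}H_n(\nu)$ decreases towards $h(\nu)$, in line with the statement that the limit equals the infimum.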

The entropy $h(\nu)$ of an ergodic measure $\nu$ can be obtained along a "typical" trajectory.
That is the content of the following theorem, sometimes called the "ergodic theorem of
information theory".

Shannon-McMillan-Breiman Theorem:

(10)   $\lim_{n \to \infty} \frac{1}{n} \log \nu[x_0 \dots x_{n-1}] = -h(\nu)$ for $\nu$-a.e. $x$.

Observe that (9) is just the integrated version of this statement. A slightly weaker
reformulation of this theorem (again for ergodic $\nu$) is known as the "asymptotic equipartition
property".

Asymptotic Equipartition Property:

Given (arbitrarily small) $\epsilon > 0$ and $\alpha > 0$, one can, for each sufficiently large $n$,
partition the set $A^n$ into a set $\mathcal{T}_n$ of typical words and a set $\mathcal{E}_n$ of exceptional
words such that each $a_0 \dots a_{n-1} \in \mathcal{T}_n$ satisfies

(11)   $e^{-n(h(\nu)+\alpha)} \leq \nu[a_0 \dots a_{n-1}] \leq e^{-n(h(\nu)-\alpha)}$

and the total probability $\sum_{a_0 \dots a_{n-1} \in \mathcal{E}_n} \nu[a_0 \dots a_{n-1}]$ of the exceptional words is
at most $\epsilon$.
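For a Bernoulli (product) measure the asymptotic equipartition property can be checked exactly, because the probability of a word depends only on its number of ones. The following sketch assumes a two-letter alphabet with $\nu[1] = p$; the values of `p` and `alpha` are illustrative. It computes the total probability of the words that violate (11).

```python
import math

def exceptional_probability(n, p, alpha):
    """Total probability of words of length n violating (11) for the Bernoulli(p) measure."""
    h = -p * math.log(p) - (1 - p) * math.log(1 - p)   # entropy of the measure
    total = 0.0
    for k in range(n + 1):                              # k = number of ones in the word
        log_word = k * math.log(p) + (n - k) * math.log(1 - p)
        rate = -log_word / n                            # -(1/n) log nu[word]
        if abs(rate - h) > alpha:
            # log of the binomial coefficient, to avoid overflow for large n
            log_count = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            total += math.exp(log_count + log_word)
    return total

p, alpha = 0.3, 0.05
exceptional = {n: exceptional_probability(n, p, alpha) for n in (100, 400, 1600)}
print(exceptional)
```

The exceptional probability shrinks as $n$ grows, so for any prescribed $\epsilon$ it eventually drops below $\epsilon$.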

4.4. A Short Digression on Complexity. Kolmogorov and Chaitin introduced
the concept of complexity of an infinite sequence of symbols. Very roughly it
is defined as follows: First, the complexity $K(x_0 \dots x_{n-1})$ of a finite word in $A^n$ is
defined as the bit length of the shortest program that causes a suitable general purpose
computer (say a PC or, for the mathematically minded reader, a Turing machine)
to print out this word. Then the complexity of an infinite sequence is defined
as $K(x) := \limsup_{n \to \infty} \frac{1}{n} K(x_0 \dots x_{n-1})$. Of course, the definition of $K(x_0 \dots x_{n-1})$
depends on the particular computer, but as any two general purpose computers can be
programmed to simulate each other (by some finite piece of software), the limit $K(x)$
is machine independent. It is the optimal compression factor for long initial pieces of a
sequence $x$ that still allows complete reconstruction of $x$ by an algorithm. Brudno [8]
showed:

If $X \subseteq A^{\mathbb{N}}$ and $\nu \in \mathcal{M}_\sigma(X)$ is ergodic, then $K(x) = \frac{1}{\log 2} h(\nu)$ for $\nu$-a.e. $x \in X$.

4.5. Entropy as a Function of the Measure. An important technical remark for the
further development of the theory is that the entropy function $h : \mathcal{M}_\sigma(X) \to [0, \infty)$ is
upper semicontinuous. This means that all sets $\{\nu : h(\nu) \geq t\}$ with $t \in \mathbb{R}$ are closed and
hence compact. In particular, upper semicontinuous functions attain their supremum.
Indeed, suppose a sequence $\nu_k \in \mathcal{M}_\sigma(X)$ converges weakly to some $\nu \in \mathcal{M}_\sigma(X)$ and
$h(\nu_k) \geq t$ for all $k$ so that also $\frac{1}{n}H_n(\nu_k) \geq t$ for all $n$ and $k$. As $H_n(\nu)$ is an expression that
depends continuously on the probabilities of the finitely many cylinders $[a_0 \dots a_{n-1}]$ and
as the indicator functions of these sets are continuous, $\frac{1}{n}H_n(\nu) = \lim_{k \to \infty} \frac{1}{n}H_n(\nu_k) \geq t$,
hence $h(\nu) \geq t$ in the limit $n \to \infty$.

A word of caution seems in order: the entropy function is rarely continuous. For example,
on the full shift $X = A^{\mathbb{N}}$ each invariant measure, whatever its entropy is, can
be approximated in the weak topology by equidistributions on periodic orbits. But all
these equidistributions have entropy zero.
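That equidistributions on periodic orbits have entropy zero can be seen from their block entropies: an orbit of period $p$ charges at most $p$ cylinders of any length, so $H_n \leq \log p$ for all $n$. The sketch below checks this for one hypothetical periodic word chosen for illustration.

```python
import math
from collections import Counter

def block_entropy_periodic(word, n):
    """H_n of the equidistribution on the shift orbit of the periodic point word word word ..."""
    p = len(word)
    # Each of the p shifts of the periodic point carries mass 1/p; record its first n symbols.
    counts = Counter(tuple(word[(s + i) % p] for i in range(n)) for s in range(p))
    return -sum((c / p) * math.log(c / p) for c in counts.values())

word = "0010111"   # illustrative periodic block of length 7
entropies = [block_entropy_periodic(word, n) for n in range(1, 51)]
print(entropies[:8])
print(entropies[-1] / 50)   # (1/n) H_n for n = 50, already small
```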

5. The Variational Principle: a Global Characterisation of Equilibrium 

Usually, a dynamical systems model of a "physical" system consists of a state space
and a map (or a differential equation) describing the dynamics. An invariant measure
for the system is rarely given a priori. Indeed, many (if not most) dynamical systems
arising in this way have uncountably many ergodic invariant measures. This limits
considerably the "practical value" of Birkhoff's ergodic theorem (8) or the
Shannon-McMillan-Breiman theorem (10): not only do the limits in these theorems depend on
the invariant measure $\nu$, but also the sets of points for which the theorems guarantee
almost everywhere convergence are practically disjoint for different $\nu$ and $\nu'$ in $\mathcal{M}_\sigma(X)$.
Therefore a choice of $\nu$ has to be made which reflects the original modelling intentions.
We will argue in this and the next sections that a variational principle with a judiciously
chosen "observable" may be a useful guideline - generalising the observations for finite
systems collected in Section 3. As announced earlier we restrict again to shift dynamical
systems, because they are rather universal models for many other systems.

5.1. Equilibrium States. We define the pressure of an observable $\phi \in C(X)$ as

(12)   $P(\phi) := \sup\{h(\nu) + \langle\phi, \nu\rangle : \nu \in \mathcal{M}_\sigma(X)\}$.

Since $\mathcal{M}_\sigma(X)$ is compact and the functional $\nu \mapsto h(\nu) + \langle\phi, \nu\rangle$ is upper semicontinuous,
the supremum is attained - not necessarily at a unique measure as we will see (which is
a remarkable difference to what happens in finite systems). Each measure $\nu$ for which
the supremum is attained is called an equilibrium state for $\phi$. Here the word "state"
is used synonymously with "distribution" or "measure" - a reflection of the fact that
in "well-behaved cases", as we will see in the next section, this measure is uniquely
determined by the constraint(s) under which it maximises entropy, and that means by
the macroscopic state of the system. (In contrast, the word "state" was used in Section 3
to designate microscopic states.)

As, for each $\nu \in \mathcal{M}_\sigma(X)$, the functional $\phi \mapsto h(\nu) + \langle\phi, \nu\rangle$ is affine on $C(X)$, the
pressure functional $P : C(X) \to \mathbb{R}$, which, by definition, is the pointwise supremum of
these functionals, is convex. It is therefore instructive to fit equilibrium states into the
abstract framework of convex analysis [32], [38], [68]. To this end recall the identities
in (4) that identify, for finite systems, equilibrium states as gradients of the pressure
function $p : \mathbb{R}^X \to \mathbb{R}$ and guarantee that $p$ is twice differentiable and strictly convex. In
the present setting where $P$ is defined on the Banach space $C(X)$, differentiability and
strict convexity are no longer guaranteed, but one can show:

Equilibrium states as (sub)-gradients:

(13)   $\mu \in \mathcal{M}_\sigma(X)$ is an equilibrium state for $\phi$ if and only if $\mu$ is a subgradient
(or tangent functional) for $P$ at $\phi$, i.e., if $P(\phi + \psi) - P(\phi) \geq \langle\psi, \mu\rangle$ for all
$\psi \in C(X)$. In particular, $\phi$ has a unique equilibrium state $\mu$ if and only if $P$ is
differentiable at $\phi$ with gradient $\mu$, i.e., if $\lim_{t \to 0} \frac{1}{t}\big(P(\phi + t\psi) - P(\phi)\big) = \langle\psi, \mu\rangle$
for all $\psi \in C(X)$.

Let us see how equilibrium states on $X = A^{\mathbb{N}}$ can directly be obtained from the
corresponding equilibrium distributions on finite sets $A^n$ of Subsection 3.2. Define
$\phi^{(n)} : A^n \to \mathbb{R}$ by $\phi^{(n)}(a_0 \dots a_{n-1}) := \phi(a_0 \dots a_{n-1}a_0 \dots a_{n-1} \dots)$, denote by $U_n$ the
corresponding global observable on $A^n$, and let $\mu_n$ be the equilibrium distribution on
$A^n$ that maximises $H(\mu) + \langle U_n, \mu\rangle$. Then all weak limit points of the "approximative
equilibrium distributions" $\mu_n$ on $A^n$ are equilibrium states on $A^{\mathbb{N}}$.

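► This can be seen as follows: Let the measure $\mu$ on $A^{\mathbb{N}}$ be any weak limit point of the $\mu_n$.
Then, given $\epsilon > 0$ there exists $k \in \mathbb{N}$ such that

$h(\mu) + \langle\phi, \mu\rangle \geq \frac{1}{k}H_k(\mu) + \langle\phi, \mu\rangle - \epsilon \geq \frac{1}{k}H_k(\mu_n) + \langle\phi^{(n)}, \mu_n\rangle - 2\epsilon$

for arbitrarily large $n$, because $\|\phi - \phi^{(n)}\|_\infty \to 0$ as $n \to \infty$ by construction of the $\phi^{(n)}$. As the $\mu_n$
are invariant under cyclic coordinate shifts (see Subsection 3.2), it follows from the subadditivity
of the entropy that

$h(\mu) + \langle\phi, \mu\rangle \geq \frac{1}{n}\big(H_n(\mu_n) + \langle U_n, \mu_n\rangle\big) - 2\epsilon - \frac{k}{n}\log|A|$.

Hence, for each $\nu \in \mathcal{M}_\sigma(X)$,

$h(\mu) + \langle\phi, \mu\rangle \geq \frac{1}{n}\big(H_n(\nu) + \langle U_n, \nu\rangle\big) - 2\epsilon - \frac{k}{n}\log|A| \to h(\nu) + \langle\phi, \nu\rangle - 2\epsilon$

as $n \to \infty$, and we see that $\mu$ is indeed an equilibrium state on $A^{\mathbb{N}}$. ◄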

5.2. The Variational Principle. In Subsection 3.1 the pressure of a finite system was
defined as a certain supremum and then identified as the logarithm of the normalising
constant for the Gibbsian representation of the corresponding equilibrium distribution.
We are now going to approximate equilibrium states by suitable Gibbs distributions
on finite subsets of $X$. As a by-product the pressure $P(\phi)$ is characterised in terms of
the logarithms of the normalising constants of these approximating distributions. Let
$S_n\phi(x) := \phi(x) + \phi(\sigma x) + \dots + \phi(\sigma^{n-1}x)$. From each cylinder set $[a_0 \dots a_{n-1}]$ we can
pick a point $z$ such that $S_n\phi(z)$ is the maximal value of $S_n\phi$ on this set. We denote
the collection of the $|A|^n$ points we obtain in this way by $E_n$. Observe that $E_n$ is not
unambiguously defined, but any choice we make will do.

Variational principle for the pressure:

(14)   $P(\phi) = \limsup_{n \to \infty} \frac{1}{n} P_n(\phi)$ where $P_n(\phi) := \log \sum_{z \in E_n} e^{S_n\phi(z)}$



► To prove the "$\leq$" direction of this identity we just have to show that $\frac{1}{n}H_n(\nu) + \langle\phi, \nu\rangle \leq \frac{1}{n}P_n(\phi)$
for each $\nu \in \mathcal{M}_\sigma(X)$ or, after multiplying by $n$, $H_n(\nu) + \langle S_n\phi, \nu\rangle \leq P_n(\phi)$. But Jensen's
inequality implies:

$H_n(\nu) + \langle S_n\phi, \nu\rangle \leq \sum_{a_0, \dots, a_{n-1} \in A} \nu[a_0 \dots a_{n-1}] \log\frac{\sup\{e^{S_n\phi(x)} : x \in [a_0 \dots a_{n-1}]\}}{\nu[a_0 \dots a_{n-1}]}$

$\leq \log \sum_{a_0, \dots, a_{n-1} \in A} \sup\{e^{S_n\phi(x)} : x \in [a_0 \dots a_{n-1}]\}$

$= \log \sum_{z \in E_n} e^{S_n\phi(z)} = P_n(\phi)$.
For the reverse inequality consider the discrete Gibbs distributions

$\pi_n := \sum_{z \in E_n} \delta_z \exp(-P_n(\phi) + S_n\phi(z))$

on the finite sets $E_n$, where $\delta_z$ denotes the unit point mass in $z$. One might be tempted to think
that all weak limit points of the measures $\pi_n$ are already equilibrium states. But this need not be
the case because there is no good reason that these limits are shift invariant. Therefore one forces
invariance of the limits by passing to measures $\mu_n$ defined by $\langle f, \mu_n\rangle := \langle\frac{1}{n}\sum_{k=0}^{n-1} f \circ \sigma^k, \pi_n\rangle$.
Weak limits of these measures are obviously shift invariant, and a more involved estimate we do
not present here shows that each such weak limit $\mu$ satisfies $h(\mu) + \langle\phi, \mu\rangle \geq P(\phi)$. ◄

We note that the same arguments work for any other sequence of sets E n which contain 
exactly one point from each cylinder. So there are many ways to approximate equilibrium 
states, and if there is more than one equilibrium state, there is generally no guarantee
that the limit is always the same.
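As a rough numerical illustration of (14), the following sketch computes $\frac{1}{n} P_n(\phi)$ by brute force on the full shift over two symbols. It assumes a potential that depends on the first two coordinates only, $\phi(x) = f(x_0, x_1)$; the table `f` and all function names are hypothetical choices made for this example. For such potentials the pressure is the logarithm of the leading eigenvalue of the matrix $(e^{f(a,b)})_{a,b}$ (compare the remark on two-coordinate potentials in Subsection 6.3), which gives a value to compare against.

```python
import itertools
import math

def birkhoff_sum(word, f):
    """S_n phi along a finite word when phi(x) = f[x_0][x_1]."""
    return sum(f[a][b] for a, b in zip(word, word[1:]))

def finite_pressure(f, n):
    """P_n(phi): log of the sum over n-cylinders of the maximum of exp(S_n phi)."""
    k = len(f)
    total = 0.0
    for cyl in itertools.product(range(k), repeat=n):
        # S_n phi also depends on x_n, so maximise over the next symbol.
        best = max(birkhoff_sum(cyl + (b,), f) for b in range(k))
        total += math.exp(best)
    return math.log(total)

def pressure_by_power_iteration(f, iters=200):
    """log of the leading eigenvalue of the positive matrix (exp f[a][b])."""
    k = len(f)
    L = [[math.exp(f[a][b]) for b in range(k)] for a in range(k)]
    v = [1.0] * k
    lam = 1.0
    for _ in range(iters):
        w = [sum(L[a][b] * v[b] for b in range(k)) for a in range(k)]
        lam = max(w)
        v = [x / lam for x in w]
    return math.log(lam)

f = [[0.3, -0.5], [0.1, -1.0]]  # hypothetical values of the potential
for n in (4, 8, 12):
    print(n, finite_pressure(f, n) / n, pressure_by_power_iteration(f))
```

The two printed values should approach each other at rate $O(1/n)$, since $P_n(\phi)$ differs from $nP(\phi)$ by a bounded amount for such a regular potential.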

5.3. Nonuniqueness of Equilibrium States: an Example. Before we turn to sufficient
conditions for the uniqueness of equilibrium states in the next section, we present
one of the simplest nontrivial examples for nonuniqueness of equilibrium states.
Motivated by the so-called Fisher-Felderhof droplet model of condensation in statistical
mechanics [24, 25], Hofbauer [31] studies an observable $\phi$ on $X = \{0,1\}^{\mathbb{N}}$
defined as follows: Let $(a_k)$ be a sequence of negative real numbers with
$\lim_{k \to \infty} a_k = 0$. Set $s_k := a_0 + \dots + a_k$. For $k \ge 1$ denote
$M_k := \{x \in X : x_0 = \dots = x_{k-1} = 1, x_k = 0\}$ and $M_0 := \{x \in X : x_0 = 0\}$, and define

$\phi(x) := a_k$ for $x \in M_k$ and $\phi(11\dots) = 0$.

Then $\phi : X \to \mathbb{R}$ is continuous, so that there exists at least one equilibrium state for
$\phi$. Hofbauer proves that there is more than one equilibrium state if and only if
$\sum_{k=0}^{\infty} e^{s_k} = 1$ and $\sum_{k=0}^{\infty} (k+1) e^{s_k} < \infty$. In that case
$P(\phi) = 0$, so one of these equilibrium states is the unit mass $\delta_{11\dots}$, and we denote
the other equilibrium state by $\mu_1$, so $h(\mu_1) + \langle \phi, \mu_1 \rangle = 0$.
In view of (13) the pressure function is not differentiable at $\phi$.

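What does the pressure function $\beta \mapsto P(\beta\phi)$ look like? As
$h(\delta_{11\dots}) + \langle \beta\phi, \delta_{11\dots} \rangle = 0$ for all $\beta$, we have
$P(\beta\phi) \ge 0$ for all $\beta$. Observe now that $\phi(x) \le 0$ with equality only for
$x = 11\dots$. This implies that $\langle \phi, \mu \rangle < 0$ for all
$\mu \in \mathcal{M}_\sigma(X)$ different from $\delta_{11\dots}$. From this we can conclude:

- $P(\beta\phi) \le P(\phi) = 0$ for $\beta > 1$, so $P(\beta\phi) = 0$ for $\beta > 1$.

- $P(\beta\phi) \ge h(\mu_1) + \langle \beta\phi, \mu_1 \rangle = h(\mu_1) + \langle \phi, \mu_1 \rangle - (1-\beta) \langle \phi, \mu_1 \rangle = -(1-\beta) \langle \phi, \mu_1 \rangle$, which is positive for $\beta < 1$.

It follows that, at $\beta = 1$, the derivative from the right of $P(\beta\phi)$ is zero, whereas the
derivative from the left is at most $\langle \phi, \mu_1 \rangle < 0$, i.e., its absolute value is
at least $-\langle \phi, \mu_1 \rangle > 0$.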

5.4. More on Equilibrium States. In more general dynamical systems the entropy
function is not necessarily upper semicontinuous and hence equilibrium states need not
exist, i.e., the supremum in (12) need not be attained by any invariant measure. A
well known sufficient property that guarantees the upper semicontinuity of the entropy
function is the expansiveness of the system, see, e.g., [53]: a continuous transformation
$T$ of a compact metric space is positively expansive, if there is a constant $\gamma > 0$ such that
for any two distinct points $x$ and $y$ from the space there is some $n \in \mathbb{N}$ such that
$T^n x$ and $T^n y$ are at least a distance $\gamma$ apart. If $T$ is a homeomorphism one says it is
expansive, if the same holds for some $n \in \mathbb{Z}$. The previous results carry over without
changes (although at the expense of more complicated proofs) to general expansive systems. The
variational principle (14) holds in the very general context where $T$ is a continuous action of
$\mathbb{Z}_+$ on a compact Hausdorff space $X$. This was proved in [44] in a simple and elegant way.
In the monograph [55] it is extended to amenable group actions.



6. The Gibbs Property: a Local Characterisation of Equilibrium 

In this section we are going to see that, for a sufficiently regular potential $\phi$ on a
topologically mixing subshift of finite type, one has a unique equilibrium state which has
the "Gibbs property". This property generalises formula (5) that we derived for finite
lattices. Subshifts of finite type are the symbolic models for Axiom A diffeomorphisms,
as we shall see later on.

6.1. Subshifts of Finite Type. We start by recalling what a subshift of finite type is
and refer the reader to [[Marcus]] or [33] for more details. Given a "transition matrix"
$M = (M_{ab})_{a,b \in A}$ whose entries are 0's or 1's, one can define a subshift $X_M$ as the set of
all sequences $x \in A^{\mathbb{N}}$ such that $M_{x_i x_{i+1}} = 1$ for all $i \in \mathbb{N}$. This is
called a subshift of finite type or a topological Markov chain. We assume that there exists some
integer $p_0$ such that $M^p$ has strictly positive entries for all $p \ge p_0$. This means that $M$
is irreducible and aperiodic. This property is equivalent to the property that the subshift of
finite type is topologically mixing. A general subshift of finite type admits a decomposition into a
finite union of transitive sets, each of which is a union of cyclically permuted sets
on which the appropriate iterate is topologically mixing. In other words, topologically
mixing subshifts of finite type are the building blocks of subshifts of finite type.
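The mixing condition on $M$ can be checked mechanically. The sketch below is only an illustration with hypothetical function names: it searches for the smallest $p$ with $M^p > 0$ entrywise, using Boolean matrix products, and stops at Wielandt's classical bound $(n-1)^2 + 1$ for primitive $n \times n$ matrices, beyond which no such $p$ can appear for the first time.

```python
def bool_product(A, B):
    """Boolean product of two square 0/1 matrices."""
    n = len(A)
    return [[int(any(A[i][k] and B[k][j] for k in range(n))) for j in range(n)]
            for i in range(n)]

def primitivity_exponent(M):
    """Smallest p with M^p > 0 entrywise, or None if M is not irreducible and aperiodic."""
    n = len(M)
    bound = (n - 1) ** 2 + 1  # Wielandt's bound on the exponent of a primitive matrix
    power = [row[:] for row in M]
    for p in range(1, bound + 1):
        if all(all(row) for row in power):
            return p
        power = bool_product(power, M)
    return None

golden_mean = [[1, 1], [1, 0]]   # the word "11" is forbidden
flip = [[0, 1], [1, 0]]          # irreducible but periodic
print(primitivity_exponent(golden_mean), primitivity_exponent(flip))
```

For the "golden mean" matrix the search returns 2, so the corresponding subshift is topologically mixing, whereas the periodic flip matrix is rejected.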

6.2. The Gibbs Property for a Class of Regular Potentials. The class of regular
potentials we consider is that of "summable variations". We denote by $\mathrm{var}_k(\phi)$ the
modulus of continuity of $\phi$ on cylinders of length $k \ge 1$, that is,

$$\mathrm{var}_k(\phi) := \sup\{ |\phi(x) - \phi(y)| : x \in [y_0 \dots y_{k-1}] \}.$$

If $\mathrm{var}_k(\phi) \to 0$ as $k \to \infty$, this means that $\phi$ is (uniformly) continuous with
respect to the distance $d$. We impose the stronger condition

(15) $$\sum_{k=1}^{\infty} \mathrm{var}_k(\phi) < \infty.$$

We can now state the main result of this section.

The Gibbs state of a summable potential. Let $X_M$ be a topologically mixing
subshift of finite type. Given a potential $\phi : X_M \to \mathbb{R}$ satisfying the summability
condition (15), there is a (probability) measure $\mu_\phi$ supported on $X_M$, that we call a
Gibbs state. It is the unique $\sigma$-invariant measure which satisfies the property:
There exists a constant $C > 0$ such that, for all $x \in X_M$ and for all $n \ge 1$,

(16) $$C^{-1} \le \frac{\mu_\phi[x_0 \dots x_{n-1}]}{\exp(S_n\phi(x) - nP(\phi))} \le C \quad \text{("Gibbs property")}$$

Moreover, the Gibbs state $\mu_\phi$ is ergodic and is also the unique equilibrium state of $\phi$,
i.e., the unique invariant measure for which the supremum in (12) is attained.

We now make several comments on this theorem. 



- The Gibbs property (16) gives a uniform control of the measure of all cylinders
in terms of their "energy". This strengthens considerably the asymptotic
equipartition property (11) that we recover if we restrict (16) to the set of
$\mu_\phi$-measure one where Birkhoff's ergodic theorem applies, and use the identity
$\langle \phi, \mu_\phi \rangle - P(\phi) = -h(\mu_\phi)$.

- Gibbs measures on topologically mixing subshifts of finite type are ergodic (and
actually mixing in a strong sense) as can be inferred from Ruelle's Perron-Frobenius
Theorem (see Subsection 6.3).

- Suppose that there is another invariant measure $\mu'$ satisfying (16), possibly with
a constant $C'$ different from $C$. It is easy to verify that $\mu' = f\mu$ for some
$\mu$-integrable function $f$ by using (16) and the Radon-Nikodym theorem. Shift
invariance imposes that, $\mu$-a.e., $f = f \circ \sigma$. Then the ergodicity of $\mu$ implies
that $f$ is a constant $\mu$-a.e., thus $\mu' = \mu$; see [5].

- One could define a Gibbs state by saying that it is an invariant measure $\mu$
satisfying (16) for a given continuous potential $\phi$. If one does so, it is simple to
verify that such a $\mu$ must also be an equilibrium state. Indeed, using (16), one
can deduce that $\langle \phi, \mu \rangle + h(\mu) \ge P(\phi)$. The converse need not be true in
general, see Subsection 7.4. But the summability condition (15) is indeed sufficient for
the coincidence of Gibbs and equilibrium states. A proof of this fact can be
found in [58] or [38].



6.3. Ruelle's Perron-Frobenius Theorem. The powerful tool behind the theorem in 
the previous subsection is a far-reaching generalisation of the classical Perron-Frobenius 
theorem for irreducible matrices. Instead of a matrix, one introduces the so-called trans- 
fer operator, also called the "Perron-Frobenius operator" or "Ruelle's operator", which 
acts on a suitable Banach space of observables. D. Ruelle [52] first introduced
this operator in the context of one-dimensional lattice gases with exponentially
decaying interactions. In our context, this corresponds to Hölder continuous potentials:
these are potentials satisfying $\mathrm{var}_k(\phi) \le c\,\theta^k$ for some $c > 0$ and
$\theta \in (0,1)$. A proof of
"Ruelle's Perron-Frobenius Theorem" can be found in [U(5]. It was then extended to 
potentials with summable variations in [67]. We refer to the book of V. Baladi [1] for a
comprehensive account on transfer operators in dynamical systems.

We content ourselves with defining the transfer operator and stating Ruelle's Perron-Frobenius
Theorem. Let $\mathcal{L} : C(X_M) \to C(X_M)$ be defined by

$$(\mathcal{L} f)(x) := \sum_{y \in \sigma^{-1}x} e^{\phi(y)} f(y) = \sum_{a \in A : M(a, x_0) = 1} e^{\phi(ax)} f(ax).$$

(Obviously, $ax := a x_0 x_1 \dots$.)

Ruelle's Perron-Frobenius Theorem. Let $X_M$ be a topologically mixing subshift
of finite type. Let $\phi$ satisfy condition (15). There exist a number $\lambda > 0$,
$h \in C(X_M)$, and $\nu \in \mathcal{M}(X_M)$ such that $h > 0$, $\langle h, \nu \rangle = 1$,
$\mathcal{L} h = \lambda h$, $\mathcal{L}^* \nu = \lambda \nu$, where $\mathcal{L}^*$ is the dual
of $\mathcal{L}$. Moreover, for all $f \in C(X_M)$,

$$\| \lambda^{-n} \mathcal{L}^n f - \langle f, \nu \rangle \, h \|_\infty \to 0 \quad \text{as } n \to \infty.$$

By using this theorem, one can show that $\mu_\phi := h\nu$ satisfies (16) and $\lambda = e^{P(\phi)}$.

Let us remark that for potentials which are such that $\phi(x) = \phi(x_0, x_1)$ (i.e., potentials
constant on cylinders of length 2), $\mathcal{L}$ can be identified with a $|A| \times |A|$ matrix and the
previous theorem boils down to the classical Perron-Frobenius theorem for irreducible
aperiodic matrices [63]. The corresponding Gibbs states are nothing but Markov chains
with state space $A$ [29, Chapter 3]. We shall take another point of view below
(Subsection 7.2).
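For two-coordinate potentials this reduction can be made completely explicit. The following sketch is an illustration under stated assumptions, not the general construction: the potential table `f`, the function names, and the use of plain power iteration are choices made for this example. It forms the matrix $K_{ab} = M_{ab} e^{f(a,b)}$, computes its leading eigenvalue $\lambda$ with right and left Perron eigenvectors, builds the associated Markov measure, and evaluates the ratio appearing in the Gibbs property (16) with $P(\phi) = \log \lambda$.

```python
import itertools
import math

def perron(K, iters=1000):
    """Leading eigenvalue and right eigenvector of a primitive matrix (power iteration)."""
    k = len(K)
    v = [1.0] * k
    lam = 1.0
    for _ in range(iters):
        w = [sum(K[a][b] * v[b] for b in range(k)) for a in range(k)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam, v

def gibbs_markov_data(M, f):
    """Weighted matrix K, eigenvalue lam, stationary vector pi, stochastic matrix P."""
    k = len(M)
    K = [[M[a][b] * math.exp(f[a][b]) for b in range(k)] for a in range(k)]
    lam, r = perron(K)                                   # K r = lam r
    _, l = perron([[K[b][a] for b in range(k)] for a in range(k)])  # l K = lam l
    norm = sum(l[a] * r[a] for a in range(k))
    pi = [l[a] * r[a] / norm for a in range(k)]
    P = [[K[a][b] * r[b] / (lam * r[a]) for b in range(k)] for a in range(k)]
    return K, lam, pi, P

def cylinder_measure(word, pi, P):
    """Markov measure of the cylinder [word]."""
    m = pi[word[0]]
    for a, b in zip(word, word[1:]):
        m *= P[a][b]
    return m

def gibbs_ratios(M, f, n):
    """Min and max of mu[x_0..x_{n-1}] / exp(S_n phi(x) - n P) over admissible words."""
    K, lam, pi, P = gibbs_markov_data(M, f)
    ratios = []
    for word in itertools.product(range(len(M)), repeat=n + 1):
        if all(M[a][b] for a, b in zip(word, word[1:])):
            weight = math.prod(K[a][b] for a, b in zip(word, word[1:])) / lam ** n
            ratios.append(cylinder_measure(word[:-1], pi, P) / weight)
    return min(ratios), max(ratios)

M = [[1, 1], [1, 0]]                 # golden mean shift
f = [[0.2, -0.3], [0.5, 0.0]]        # hypothetical potential values
for n in (3, 6, 9):
    print(n, gibbs_ratios(M, f, n))
```

The printed bounds do not depend on $n$, which is exactly the uniformity in $n$ that the constant $C$ in (16) expresses.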

6.4. Relative Entropy. We now define the relative entropy of an invariant measure
$\nu \in \mathcal{M}_\sigma(X_M)$ given a Gibbs state $\mu_\phi$ as follows. We first define

(17) $$H_n(\nu | \mu_\phi) := \sum_{a_0,\dots,a_{n-1} \in A} \nu[a_0 \dots a_{n-1}] \, \log \frac{\nu[a_0 \dots a_{n-1}]}{\mu_\phi[a_0 \dots a_{n-1}]}$$

with the convention $0 \log(0/0) = 0$. Now the relative entropy of $\nu$ given $\mu_\phi$ is defined
as

$$h(\nu | \mu_\phi) := \limsup_{n \to \infty} \frac{1}{n} H_n(\nu | \mu_\phi).$$

(By applying Jensen's inequality, one verifies that $h(\nu | \mu_\phi) \ge 0$.) In fact the limit exists
and can be computed quite easily using (16):

(18) $$h(\nu | \mu_\phi) = P(\phi) - \langle \phi, \nu \rangle - h(\nu).$$

► To prove this formula, we first make the following observation. It can be easily verified
that the inequalities in (16) remain the same when $S_n\phi$ is replaced by the "locally averaged"
energy $\phi_n := (\nu[x_0 \dots x_{n-1}])^{-1} \int_{[x_0 \dots x_{n-1}]} S_n\phi(y) \, d\nu(y)$ for any
cylinder with $\nu[x_0 \dots x_{n-1}] > 0$. Cylinders with $\nu$-measure zero do not contribute to the
sum in (17).

We can now write that

$$-\frac{1}{n} \log C \le -\frac{1}{n} H_n(\nu | \mu_\phi) + \Big( P(\phi) - \frac{1}{n} \langle S_n\phi, \nu \rangle - \frac{1}{n} H_n(\nu) \Big) \le \frac{1}{n} \log C.$$

To finish we use that $\langle S_n\phi, \nu \rangle = n \langle \phi, \nu \rangle$ (by the invariance of
$\nu$) and we apply (9) to obtain

$$\lim_{n \to \infty} \frac{1}{n} H_n(\nu | \mu_\phi) = P(\phi) - \langle \phi, \nu \rangle - \lim_{n \to \infty} \frac{1}{n} H_n(\nu) = P(\phi) - \langle \phi, \nu \rangle - h(\nu)$$

which proves (18). ◄



The variational principle revisited. We can reformulate the variational principle
in the case of a potential satisfying the summability condition (15):

(19) $h(\nu | \mu_\phi) = 0$ if and only if $\nu = \mu_\phi$,

i.e., given $\mu_\phi$, the relative entropy $h(\cdot | \mu_\phi)$, as a function on
$\mathcal{M}_\sigma(X_M)$, attains its minimum only at $\mu_\phi$.

Indeed, by (18) we have $h(\nu | \mu_\phi) = P(\phi) - \langle \phi, \nu \rangle - h(\nu)$. We now use (12)
and the fact that $\mu_\phi$ is the unique equilibrium state of $\phi$ to conclude.
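A simple sanity check of formula (18) is available for Bernoulli measures. The sketch below rests on the following assumptions, all made for the sake of the example: the Gibbs state is the Bernoulli measure with weights `p`, which corresponds to the potential $\phi(x) = \log p_{x_0}$ with $P(\phi) = 0$, and $\nu$ is the Bernoulli measure with weights `q`. In that case both sides of (18) reduce to the Kullback-Leibler divergence $\sum_a q_a \log(q_a / p_a)$.

```python
import itertools
import math

def relative_entropy_rate(q, p, n):
    """(1/n) H_n(nu | mu) for Bernoulli measures nu (weights q) and mu (weights p)."""
    total = 0.0
    for word in itertools.product(range(len(p)), repeat=n):
        nu = math.prod(q[a] for a in word)
        mu = math.prod(p[a] for a in word)
        if nu > 0:  # convention 0 log(0/0) = 0
            total += nu * math.log(nu / mu)
    return total / n

def right_hand_side_of_18(q, p):
    """P(phi) - <phi, nu> - h(nu) for phi(x) = log p[x_0]; requires all p[a] > 0."""
    pressure = math.log(sum(p))  # equals 0 for a probability vector
    mean_phi = sum(qa * math.log(pa) for qa, pa in zip(q, p))
    entropy = -sum(qa * math.log(qa) for qa in q if qa > 0)
    return pressure - mean_phi - entropy

p = [0.5, 0.3, 0.2]   # hypothetical weights of the Gibbs state
q = [0.2, 0.2, 0.6]   # hypothetical weights of nu
print(relative_entropy_rate(q, p, 5), right_hand_side_of_18(q, p))
```

Both numbers agree, and `right_hand_side_of_18(p, p)` vanishes, in line with (19).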



6.5. More Properties of Gibbs States. Gibbs states enjoy very good statistical prop- 
erties. Let us mention only a few. They satisfy the "Bernoulli property", a very strong 
qualitative mixing condition [JJ O [67]. The sequence of random variables
$(f \circ \sigma^n)_n$ satisfies the central limit theorem [151 \W[ [39] and a large deviation
principle if $f$ is Hölder continuous (2TJ [381 BH1 EO]. Let us emphasise the central role played by relative entropy
in large deviations. (The deep link between thermodynamics and large deviations is 
described in [32] in a much more general context.) Finally, the so-called "multifractal 
analysis" can be performed for Gibbs states, see, e.g., 



7. Examples on Shift Spaces 

7.1. Measure of Maximal Entropy and Periodic Points. If the observable $\phi$ is
constant zero, an equilibrium state simply maximises the entropy. It is called a measure of
maximal entropy. The quantity $P(0) = \sup\{h(\nu) : \nu \in \mathcal{M}_\sigma(X)\}$ is called the
topological entropy of the subshift $\sigma : X \to X$. When $X$ is a subshift of finite type $X_M$ with
irreducible and aperiodic transition matrix $M$, there is a unique measure of maximal
entropy, see, e.g., [43]. As a Gibbs state it satisfies (16). By summing over all cylinders
$[x_0 \dots x_{n-1}]$ allowed by $M$, it is easy to see that the topological entropy $P(0)$ is the
asymptotic exponential growth rate of the number of sequences of length $n$ that can
occur as initial segments of points in $X_M$. This is obviously identical to the logarithm
of the largest eigenvalue of the transition matrix $M$.

It is not difficult to verify that the total number of periodic sequences of period $n$ equals
the trace of the matrix $M^n$, i.e., we have the formula

$$\mathrm{Card}\{x \in X_M : \sigma^n x = x\} = \mathrm{tr}(M^n) = \sum_{i=1}^{m} \lambda_i^n,$$

where $\lambda_1, \dots, \lambda_m$ are all the eigenvalues of $M$. Asymptotically, of course,
$\mathrm{Card}\{x \in X_M : \sigma^n x = x\} = e^{nP(0)} + O(|\lambda'|^n)$, where $\lambda'$ is the second
largest (in absolute value) eigenvalue of $M$.

The measure of maximal entropy, call it $\mu_0$, describes the distribution of periodic points
in $X_M$: one can prove [3, 37] that for any cylinder $B \subset X_M$

$$\lim_{n \to \infty} \frac{\mathrm{Card}\{x \in B : \sigma^n x = x\}}{\mathrm{Card}\{x \in X_M : \sigma^n x = x\}} = \mu_0(B).$$

In other words, the finite atomic measure that assigns equal weights
$1/\mathrm{Card}\{x \in X_M : \sigma^n x = x\}$ to each periodic point in $X_M$ with period $n$ weakly
converges to $\mu_0$, as $n \to \infty$. Each such measure has zero entropy while
$h(\mu_0) = P(0) > 0$, so the entropy is not continuous on the space of invariant measures. It is,
however, upper-semicontinuous (see Subsection 4.5).
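The trace formula for periodic points is easy to test by direct enumeration. The sketch below uses hypothetical function names and the golden mean matrix as an example; it compares $\mathrm{tr}(M^n)$ with a brute-force count of the cyclically admissible words of length $n$, each of which corresponds to exactly one point $x$ with $\sigma^n x = x$.

```python
import itertools

def trace_of_power(M, n):
    """tr(M^n) computed with exact integer arithmetic."""
    k = len(M)
    power = [[int(i == j) for j in range(k)] for i in range(k)]
    for _ in range(n):
        power = [[sum(power[i][l] * M[l][j] for l in range(k)) for j in range(k)]
                 for i in range(k)]
    return sum(power[i][i] for i in range(k))

def count_periodic_points(M, n):
    """Number of x in X_M with sigma^n x = x, by listing admissible cyclic words."""
    k = len(M)
    count = 0
    for w in itertools.product(range(k), repeat=n):
        if all(M[w[i]][w[(i + 1) % n]] for i in range(n)):
            count += 1
    return count

golden_mean = [[1, 1], [1, 0]]
for n in range(1, 9):
    print(n, trace_of_power(golden_mean, n), count_periodic_points(golden_mean, n))
```

For the golden mean shift both columns list the Lucas numbers 1, 3, 4, 7, 11, ..., whose exponential growth rate is the logarithm of the golden ratio, i.e., the topological entropy $P(0)$.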

In fact, it is possible to approximate any Gibbs state $\mu_\phi$ on $X_M$ in a similar way, by
finite atomic measures on periodic orbits, by assigning weights properly (see, e.g.,
[37, Theorem 20.3.7]).



7.2. Markov Chains over Finite Alphabets. Let $Q = (q_{a,b})_{a,b \in A}$ be an irreducible
stochastic matrix over the finite alphabet $A$. It is well known (see, e.g., [63]) that there
exists a unique probability vector $\pi$ on $A$ that defines a stationary Markov measure $\nu_Q$
on $X = A^{\mathbb{N}}$ by $\nu_Q[a_0 \dots a_{n-1}] = \pi_{a_0} q_{a_0 a_1} \cdots q_{a_{n-2} a_{n-1}}$. We are going
to identify $\nu_Q$ as the unique Gibbs distribution $\mu \in \mathcal{M}_\sigma(X)$ that maximises entropy
under the constraints $\mu[ab] = \mu[a] q_{ab}$, i.e., $\langle \phi^{ab}, \mu \rangle = 0$ ($a, b \in A$), where
$\phi^{ab} := 1_{[ab]} - q_{ab} 1_{[a]}$. Indeed, as $\mu$ is a Gibbs measure, there are
$\beta_{ab} \in \mathbb{R}$ ($a, b \in A$) and constants $P \in \mathbb{R}$, $C > 0$ such that

(20) $$C^{-1} \le \frac{\mu[x_0 \dots x_{n-1}]}{\exp\big( \sum_{a,b \in A} \beta_{ab} S_n\phi^{ab}(x) - nP \big)} \le C$$

for all $x \in A^{\mathbb{N}}$ and all $n \in \mathbb{N}$. Let
$r_{ab} := \exp\big( \beta_{ab} - \sum_{b' \in A} \beta_{ab'} q_{ab'} - P \big)$. Then, up to a factor bounded
uniformly in $n$, the denominator in (20) equals $r_{x_0 x_1} \cdots r_{x_{n-2} x_{n-1}}$, and it follows
that $\mu$ is equivalent to the stationary Markov measure defined by the (non-stochastic) matrix
$(r_{ab})_{a,b \in A}$. As $\mu$ is ergodic, $\mu$ is this Markov measure, and as $\mu$ satisfies the linear
constraints $\mu[ab] = \mu[a] q_{ab}$, we conclude that $\mu = \nu_Q$.

7.3. The Ising Chain. Here the task is to characterise all "spin chains" in
$x \in \{-1,+1\}^{\mathbb{N}}$ (or, more commonly, $\{-1,+1\}^{\mathbb{Z}}$) which are as random as possible
with the constraint that two adjacent spins have a prescribed probability $p \ne \frac{1}{2}$ to be
identical. With $\phi(x) := x_0 x_1$ this is equivalent to requiring that $x$ is typical for a Gibbs
distribution $\mu_{\beta\phi}$ where $\beta = \beta(p)$ is such that
$\langle \phi, \mu_{\beta\phi} \rangle = 2p - 1$. It follows that there is a constant
$C > 0$ such that for each $n \in \mathbb{N}$ and any two "spin patterns" $a = a_0 \dots a_{n-1}$ and
$b = b_0 \dots b_{n-1}$

$$\left| \log \frac{\mu_{\beta\phi}[a_0 \dots a_{n-1}]}{\mu_{\beta\phi}[b_0 \dots b_{n-1}]} - 2\beta (N_a - N_b) \right| \le C$$

where $N_a$ and $N_b$ are the numbers of identical adjacent spins in $a$ and $b$, respectively.
(The factor 2 arises because $S_n\phi$ counts identical pairs with weight $+1$ and different pairs
with weight $-1$.)
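The dependence $\beta = \beta(p)$ can be written down explicitly. The sketch below relies on the reduction to Markov chains for two-coordinate potentials (Subsection 6.3): the matrix with entries $e^{\beta ab}$, $a, b \in \{-1,+1\}$, has the constant vector as Perron eigenvector with eigenvalue $2\cosh\beta$, so the Gibbs state of $\beta\phi$ is the symmetric Markov chain in which adjacent spins agree with probability $e^{\beta}/(2\cosh\beta)$. The function names are hypothetical.

```python
import math

def prob_identical(beta):
    """Probability that two adjacent spins agree under the Gibbs state of beta * x_0 x_1."""
    return math.exp(beta) / (2.0 * math.cosh(beta))

def beta_of_p(p):
    """Inverse of prob_identical: the beta giving agreement probability p, 0 < p < 1."""
    return 0.5 * math.log(p / (1.0 - p))

p = 0.8
beta = beta_of_p(p)
# <phi, mu> = p - (1 - p) = 2p - 1 should coincide with tanh(beta).
print(beta, prob_identical(beta), math.tanh(beta), 2 * p - 1)
```

In particular $\beta(p) > 0$ for $p > \frac{1}{2}$ (spins tend to align) and $\beta(p) < 0$ for $p < \frac{1}{2}$.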
7.4. More on Hofbauer's Example. We can come back to the example described in
Subsection 5.3. It is easy to verify that in that example $\mathrm{var}_{k+1}(\phi) = |a_k|$. For
instance, if $a_k = -\frac{1}{(k+1)^2}$ there is a unique Gibbs/equilibrium state. If
$a_k = -3 \log\big(\frac{k+1}{k}\big)$ for $k \ge 1$ and $a_0 = -\log \sum_{j=1}^{\infty} j^{-3}$, then
from [31] we know that $\phi$ admits more than one equilibrium state, one of them being
$\delta_{11\dots}$, which cannot be a Gibbs state for any continuous $\phi$.
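Hofbauer's two conditions from Subsection 5.3 can be checked numerically for the second choice of $(a_k)$. The sketch below assumes that this choice reads $a_k = -3\log\frac{k+1}{k}$ for $k \ge 1$ and $a_0 = -\log\sum_{j \ge 1} j^{-3}$, so that $e^{s_k} = (k+1)^{-3} / \sum_{j \ge 1} j^{-3}$; the truncation levels and the function name are arbitrary choices made for illustration.

```python
import math

def hofbauer_sums(n_terms=20000, zeta_terms=200000):
    """Partial sums of sum_k e^{s_k} and sum_k (k+1) e^{s_k} for the example sequence."""
    zeta3 = sum(j ** -3 for j in range(1, zeta_terms + 1))  # truncated series for a_0
    s = -math.log(zeta3)             # s_0 = a_0
    sum_exp = math.exp(s)            # k = 0 term of sum e^{s_k}
    sum_weighted = math.exp(s)       # k = 0 term of sum (k+1) e^{s_k}
    for k in range(1, n_terms):
        s += -3.0 * math.log((k + 1) / k)   # add a_k to obtain s_k
        e = math.exp(s)
        sum_exp += e
        sum_weighted += (k + 1) * e
    return sum_exp, sum_weighted

print(hofbauer_sums())
```

The first partial sum is close to 1 and the second stays bounded (it approaches the ratio of $\sum_j j^{-2}$ and $\sum_j j^{-3}$), consistent with the existence of more than one equilibrium state.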



8. Examples from Differentiable Dynamics 

In this section we present a number of examples to which the general theory developed 
above does not apply directly but only after a transfer of the theory from a symbolic 
space to a manifold. We restrict to examples where the results can be transferred 
because those aspects of the smooth dynamics we focus on can be studied as well on 
a shift dynamical system that is obtained from the original one via symbolic coding.
(We do not discuss the coding process itself which is sometimes far from trivial, but we 
focus on the application of the Gibbs and equilibrium theory.) There are alternative 
approaches where instead of the results the concepts and (partly) the strategies of proofs 
are transferred to the smooth dynamical systems. This has led both to an extension
of the range of possible applications of the theory and to a number of refined results 
(because some special features of smooth systems necessarily get lost by transferring the 
analysis to a completely disconnected metric space). 

In the following examples, $T$ denotes a (possibly piecewise) differentiable map of a
compact smooth manifold $M$. Points on the manifold are denoted by $u$ and $v$. In all examples
there is a Hölder continuous coding map $\pi : X \to M$ from a subshift of finite type $X$
onto the manifold which respects the dynamics, i.e., $T \circ \pi = \pi \circ \sigma$. This factor map
$\pi$ is "nearly" invertible in the sense that the set of points in $M$ with more than
one preimage under $\pi$ has measure zero for all $T$-invariant measures we are interested
in. Hence such measures $\tilde\mu$ on $M$ correspond unambiguously to shift invariant measures
$\mu$ on $X$ via $\tilde\mu = \mu \circ \pi^{-1}$. Similarly observables $\tilde\phi$ on $M$ and
$\phi = \tilde\phi \circ \pi$ on $X$ are related.

8.1. Uniformly Expanding Markov Maps of the Interval. A transformation $T$ on
$M := [0,1]$ is called a Markov map, if there are $0 = u_0 < u_1 < \dots < u_N = 1$ such
that each restriction $T|_{(u_{i-1}, u_i)}$ is strictly monotone, $C^{1+r}$ for some $r > 0$, and maps
$(u_{i-1}, u_i)$ onto a union of some of these $N$ monotonicity intervals. It is called uniformly
expanding if there is some $k \in \mathbb{N}$ such that $\lambda := \inf_x |(T^k)'(x)| > 1$. It is not
difficult to verify that the symbolic coding of such a system leads to a topological Markov chain
over the alphabet $A = \{1, \dots, N\}$. To simplify the discussion we assume that the transition
matrix $M$ of this topological Markov chain is irreducible and aperiodic.

Our goal is to find a $T$-invariant measure $\tilde\mu$ represented by
$\mu \in \mathcal{M}_\sigma(X_M)$ which minimises the relative entropy to Lebesgue measure on $[0,1]$

$$h(\tilde\mu | m) := \lim_{n \to \infty} \frac{1}{n} \sum_{a_0,\dots,a_{n-1} \in \{1,\dots,N\}} \mu[a_0 \dots a_{n-1}] \, \log \frac{\mu[a_0 \dots a_{n-1}]}{m[a_0 \dots a_{n-1}]}$$

where $m[a_0 \dots a_{n-1}] := |I_{a_0 \dots a_{n-1}}|$. (Recall that, without insisting on invariance, this
would just be Lebesgue measure itself.) The existence of the limit will be justified below
- observe that $m$ is not a Gibbs state as in Section 6. The argument rests on the simple
observation (implied by the uniform expansion and the piecewise Hölder-continuity of
$T'$) that $T$ has bounded distortion, i.e., that there is a constant $C > 0$ such that for all
$n \in \mathbb{N}$, $a_0 \dots a_{n-1} \in \{1,\dots,N\}^n$ and $u \in I_{a_0 \dots a_{n-1}}$,

(21) $$C^{-1} \le |I_{a_0 \dots a_{n-1}}| \cdot |(T^n)'(u)| \le C, \quad \text{or, equivalently,} \quad C^{-1} \le \frac{m[a_0 \dots a_{n-1}]}{\exp(S_n\phi(u))} \le C$$

where $\phi(u) := -\log |T'(u)|$. (Observe the similarity between this property and the Gibbs
property (16).) Assuming bounded distortion we have at once

$$h(\tilde\mu | m) = \lim_{n \to \infty} \frac{1}{n} \Big( -H_n(\mu) - \sum_{k=0}^{n-1} \langle \phi \circ \sigma^k, \mu \rangle \Big) = -h(\mu) - \langle \phi, \mu \rangle,$$

and minimising this relative entropy just amounts to maximising
$h(\mu) + \langle \phi, \mu \rangle$ for $\phi = -\log |T'| \circ \pi$. Note that (21), combined with
(14) and the fact that the lengths $|I_{a_0 \dots a_{n-1}}|$ sum to 1, gives $P(\phi) = 0$. As the
results on Gibbs distributions from Section 6 apply, we conclude that

$$C^{-1} \le \frac{\mu[a_0 \dots a_{n-1}]}{|I_{a_0 \dots a_{n-1}}|} \le C$$

for some $C > 0$. So the unique $T$-invariant measure $\tilde\mu$ that minimises the relative entropy
$h(\tilde\mu | m)$ is equivalent to Lebesgue measure $m$. (The existence of an invariant probability
measure equivalent to $m$ is well-known, also without invoking entropy theory. It is
guaranteed by a "Folklore Theorem" [33].)

8.2. Interval Maps with an Indifferent Fixed Point. The presence of just one
point $x \in [0,1]$ such that $T'(x) = 1$ dramatically changes the properties of the system.
A canonical example is the map $T_\alpha : x \mapsto x(1 + 2^\alpha x^\alpha)$ if $x \in [0, 1/2[$ and
$x \mapsto 2x - 1$ if $x \in [1/2, 1]$. We have $T_\alpha'(0) = 1$, i.e., $0$ is an indifferent fixed
point. For $\alpha \in [0, 1[$ this map admits an absolutely continuous invariant probability measure
$d\mu(x) = h(x)\,dx$, where $h(x) \sim x^{-\alpha}$ when $x \to 0$ [66]. In the physics literature, this
type of map is known as the "Manneville-Pomeau" map. It was introduced as a model of transition from
laminar to intermittent behaviour [50]. In [28] the authors construct a piecewise affine version of
this map to study the complexity of trajectories (in the sense of Subsection 4.4). This
gives rise to a countable state Markov chain. In [69] the close connection to the
Fisher-Felderhof model and Hofbauer's example (see Subsection 5.3) was realised. We refer to
[61] for recent developments and a list of references.
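The effect of the indifferent fixed point is easy to observe numerically. The following sketch, with hypothetical function names and parameter values, implements $T_\alpha$ as defined above and counts how many iterations an orbit started close to $0$ needs to leave the left branch $[0, 1/2[$; these long "laminar" phases are the signature of intermittency.

```python
def manneville_pomeau(x, alpha):
    """The map T_alpha: x(1 + 2^alpha x^alpha) on [0, 1/2[, and 2x - 1 on [1/2, 1]."""
    if x < 0.5:
        return x * (1.0 + (2.0 * x) ** alpha)
    return 2.0 * x - 1.0

def escape_time(x0, alpha, threshold=0.5, max_iter=10**7):
    """Number of iterations needed to leave [0, threshold[ when starting at x0."""
    x, n = x0, 0
    while x < threshold and n < max_iter:
        x = manneville_pomeau(x, alpha)
        n += 1
    return n

alpha = 0.5
for x0 in (1e-2, 1e-4, 1e-6):
    print(x0, escape_time(x0, alpha))
```

Since $T_\alpha(x) - x = 2^\alpha x^{1+\alpha}$ near $0$, the escape time grows roughly like $x_0^{-\alpha}$ as $x_0 \to 0$, in contrast to the logarithmic growth one would see near a repelling fixed point.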



8.3. Axiom A Diffeomorphisms, Anosov Diffeomorphisms, Sinai-Ruelle-Bowen 
Measures. The first spectacular application of the theory of Gibbs measures to differ- 
entiable dynamical systems was Sinai's approach to Anosov diffeomorphism via Markov 
partitions [M] that allowed to code the dynamics of these maps into a subshift of fi- 
nite type and to study their invariant measures by methods from equilibrium statistical 
mechanics [65] that had been developed previously by Dobrushin, Lanford and Ruelle 
[13 13 [191 EH H] • Not much later this approach was extended by Bowen [2] to Smale's 
Axiom A diffeomorphisms (and to Axiom A flows by Bowen and Ruelle [7]); see also
The interested reader can consult, e.g., [71] for a survey, and either [5] or [15] for
details.

Both types of diffeomorphisms act on a smooth compact Riemannian manifold $M$ and
are characterised by the existence of a compact $T$-invariant hyperbolic set $\Lambda \subset M$. Their
basic properties are described in detail in the contribution [[Nicoll-Petersen]]. Very
briefly, the tangent bundle over $\Lambda$ splits into two invariant subbundles - a stable one and
an unstable one. Correspondingly, through each point of $\Lambda$ there passes a local stable and
a local unstable manifold which are both tangent to the respective subspaces of the local
tangent space. The unstable derivative of $T$, i.e., the derivative $DT$ restricted to the
unstable subbundle, is uniformly expanding. Its Jacobian determinant, denoted by $J^{(u)}$,
is Hölder continuous as a function on $\Lambda$. Hence the observable
$\phi^{(u)} := -\log |J^{(u)}| \circ \pi$ is Hölder continuous, and the Gibbs and equilibrium theory
apply (via the symbolic coding) to the diffeomorphism $T$ (modulo possibly a decomposition of the
hyperbolic set into irreducible and aperiodic components, called basic sets, that can be modelled
by topologically mixing subshifts of finite type). The main results are:

Characterisation of attractors

The following assertions are equivalent for a basic set $\Omega \subset \Lambda$:

(i) $\Omega$ is an attractor, i.e., there are arbitrarily small neighbourhoods $U \subset M$ of
$\Omega$ such that $TU \subset U$.

(ii) The union of all stable manifolds through points of $\Omega$ is a subset of $M$ with positive
volume.

(iii) The pressure $P_{T|\Omega}(\phi^{(u)}) = 0$.

In this case the unique equilibrium and Gibbs state $\mu^+$ of $T|_\Omega$ is called the
Sinai-Ruelle-Bowen (SRB) measure of $T|_\Omega$. It is uniquely characterised by the identity
$h_{T|\Omega}(\mu^+) = -\langle \phi^{(u)}, \mu^+ \rangle$. (For all other $T$-invariant measures on
$\Omega$ one has "$<$" instead of "$=$".)

Further properties of SRB measures

Suppose $P_{T|\Omega}(\phi^{(u)}) = 0$ and let $\mu^+$ be the SRB measure.

(a) For a set of points $u \in M$ of positive volume we have:

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(T^k u) = \langle f, \mu^+ \rangle.$$

(Indeed, because of (ii) of the above characterisation, this holds for almost all points
of the union of the stable manifolds through points of $\Omega$.)

(b) Conditioned on unstable manifolds, $\mu^+$ is absolutely continuous with respect to the volume
measure on unstable manifolds.

In the special case of transitive Anosov diffeomorphisms, the whole manifold is a hyperbolic
set and $\Omega = M$. Because of transitivity, property (ii) from the characterisation of
attractors is trivially satisfied, so there is always a unique SRB measure $\mu^+$. As $T^{-1}$ is
an Anosov diffeomorphism as well - only the roles of stable and unstable manifolds are
interchanged - $T^{-1}$ has a unique SRB measure $\mu^-$ which is the unique equilibrium state
of $T^{-1}$ (and hence also of $T$) for $\phi^{(s)} := \log |J^{(s)}|$. One can show:

SRB measures for Anosov diffeomorphisms

The following assertions are equivalent:

(i) $\mu^+ = \mu^-$.

(ii) $\mu^+$ or $\mu^-$ is absolutely continuous w.r.t. the volume measure on $M$.

(iii) For each periodic point $u = T^n u \in M$, $|J(u)| = 1$, where $J$ denotes the determinant
of $DT$.

We remark that, similarly as in the case of Markov interval maps, the unstable Jacobian
of $T^n$ at $u$ is asymptotically equivalent to the volume of the "$n$-cylinder" of the Markov
partition around $u$. So the maximisation of $h(\mu) + \langle \phi^{(u)}, \mu \rangle$ by the SRB
measure $\mu^+$ can again be interpreted as the minimisation of the relative entropy of invariant
measures with respect to the normalised volume, and the fact that $P(\phi^{(u)}) = 0$ in the Anosov
(or more generally attractor) case means that $\mu^+$ is as close to being absolutely continuous
as is possible for a singular measure. This is reflected by the above properties (a)
and (b).

We emphasise the meaning of property (a) above: it tells us that the SRB measure $\mu^+$ is
the only physically observable measure. Indeed, in numerical experiments with physical
models, one picks an initial point $u \in M$ "at random" (i.e., with respect to the volume
or Lebesgue measure) and follows its orbit $T^k u$, $k \ge 0$.

8.4. Bowen's Formula for the Hausdorff Dimension of Conformal Repellers. 

Just as nearby orbits converge towards an attractor, they diverge away from a repeller.
Conformal repellers form a nice class of systems which can be coded by a subshift of
finite type. The construction of their Markov partitions is much simpler than that of
Anosov diffeomorphisms, see, e.g., [72].

Let us recall the definition of a conformal repeller before giving a fundamental example.
Given a holomorphic map $T : V \to \mathbb{C}$ where $V \subset \mathbb{C}$ is open and $J$ a compact
subset of $\mathbb{C}$, one says that $(J, V, T)$ is a conformal repeller if

(i) there exist $C > 0$, $\alpha > 1$ such that $|(T^n)'(z)| \ge C\alpha^n$ for all $z \in J$, $n \ge 1$;

(ii) $J = \bigcap_{n \ge 1} T^{-n}(V)$ and

(iii) for any open set $U$ such that $U \cap J \ne \emptyset$, there exists $n$ such that
$T^n(U \cap J) \supset J$.

From the definition it follows that $T(J) = J$ and $T^{-1}(J) = J$.

A fundamental example is the map $T : z \mapsto z^2 + c$, $c \in \mathbb{C}$ being a parameter. It can
be shown that for $|c| < \frac{1}{4}$ there exists a compact set $J$, called a (hyperbolic) Julia set,
such that $(J, \overline{\mathbb{C}}, T)$ is a conformal repeller. As usual, $\overline{\mathbb{C}}$ denotes
the Riemann sphere (the compactification of $\mathbb{C}$).

Conformal repellers $J$ are in general fractal sets and one can measure their "degree of
fractality" by means of their Hausdorff dimension, $\dim_H(J)$. Roughly speaking, one
computes this dimension by covering the set $J$ by balls with radius less than or equal
to $\delta$. If $N_\delta(J)$ denotes the cardinality of the smallest such covering, then we expect
that

$$N_\delta(J) \sim \delta^{-\dim_H(J)}, \quad \text{as } \delta \to 0.$$
We refer the reader to [[Schmeling]] or |22} l^tj] for a rigorous definition (based on 
Carathéodory's construction) and for more information on fractal geometry.

Bowen's formula relates $\dim_H(J)$ to the unique zero of the pressure function
$\beta \mapsto P(\beta\phi)$ where $\phi := -(\log |T'|)|_J$. It is not difficult to see that indeed
this map has a unique zero for some positive $\beta$.

► By property (i), $S_n\phi \le \mathrm{const} - n \log \alpha$, which implies (by (13)) that
$\frac{d}{d\beta} P(\beta\phi) = \langle \phi, \mu_\beta \rangle \le -\log \alpha < 0$. As $P(0)$ equals the
topological entropy of $J$, i.e., the logarithm of the largest eigenvalue of the matrix $M$
associated to the Markov partition, $P(0)$ is strictly positive. Therefore (recall that the pressure
function is continuous) there exists a unique number $\beta_0 > 0$ such that
$P(\beta_0\phi) = 0$. ◄

It turns out that this unique zero is precisely $\dim_H(J)$:

Bowen's formula. The Hausdorff dimension of $J$ is the unique solution of the equation
$P(\beta\phi) = 0$, $\beta \in \mathbb{R}$; in particular

$$P(\dim_H(J)\,\phi) = 0.$$



This formula was proven in [55] for a general class of conformal repellers after the seminal
paper [B]. The main tool is a distortion estimate very similar to (21). A simple exposition
can be found in [72].
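To see how the equation $P(\beta\phi) = 0$ is solved in practice, consider a toy case that is simpler than a Julia set: a piecewise affine expanding map whose inverse branches contract by fixed ratios $r_1, \dots, r_m$ and which is coded by the full shift. Under this assumption (made purely for illustration) $\phi = -\log|T'|$ depends on the first symbol only, so $P(\beta\phi) = \log \sum_i r_i^\beta$, and Bowen's equation becomes $\sum_i r_i^\beta = 1$. The function names below are hypothetical.

```python
import math

def pressure(beta, ratios):
    """P(beta * phi) = log sum_i r_i^beta for an affine repeller coded by the full shift."""
    return math.log(sum(r ** beta for r in ratios))

def bowen_dimension(ratios, tol=1e-12):
    """Unique zero of beta -> P(beta * phi), found by bisection (needs 0 < r_i < 1)."""
    lo, hi = 0.0, 1.0
    while pressure(hi, ratios) > 0:   # enlarge the bracket until the pressure is <= 0
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pressure(mid, ratios) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(bowen_dimension([1 / 3, 1 / 3]), math.log(2) / math.log(3))
```

For two branches of ratio $1/3$ (the middle-thirds Cantor set) the bisection reproduces the classical value $\log 2 / \log 3$. The bisection is justified by the same monotonicity argument as in the proof above: the pressure is strictly decreasing in $\beta$ and positive at $\beta = 0$.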

9. Nonequilibrium Steady States and Entropy Production

SRB measures for Anosov diffeomorphisms and Axiom A attractors have been accepted
recently as conceptual models for nonequilibrium steady states in nonequilibrium
statistical mechanics. Let us point out that the word "equilibrium" is used in physics in
a much more restricted sense than in ergodic theory. Only diffeomorphisms preserving
the natural volume of the manifold (or a measure equivalent to the volume) would be
considered as appropriate toy models of physical equilibrium situations. In the case of
Anosov diffeomorphisms this is precisely the case if the "forward" and "backward" SRB
measures $\mu^+$ and $\mu^-$ coincide. Otherwise the diffeomorphism models a situation out
of equilibrium, and the difference between $\mu^+$ and $\mu^-$ can be related to entropy
production and irreversibility.

Gallavotti and Cohen [27, 26] introduced SRB measures as idealised models of nonequilibrium
steady states around 1995. In order to have as firm a mathematical basis as
possible they made the "chaotic hypothesis" that the systems they studied behave like
transitive Anosov systems. Ruelle [56] extended their approach to more general (even
nonuniformly) hyperbolic dynamics; see also his reviews [59, 57] for more recent accounts
discussing also a number of related problems; see also [51]. The importance of the Gibbs
property of SRB measures for the discussion of entropy production was also highlighted
in [35], where it is shown that for transitive Anosov diffeomorphisms the relative entropy
$h(\mu^+ | \mu^-)$ equals the average entropy production rate $\langle \log |J|, \mu^+ \rangle$ of
$\mu^+$ where $J$ denotes again the Jacobian determinant of the diffeomorphism. In particular, the
entropy production rate is zero if, and only if, $h(\mu^+ | \mu^-) = 0$, i.e., using coding and
(19), if, and only if, $\mu^+ = \mu^-$. According to Subsection 8.3 this is also equivalent to
$\mu^+$ or $\mu^-$ being absolutely continuous with respect to the volume measure.

10. Some Ongoing Developments and Future Directions 

As we saw, many dynamical systems with uniform hyperbolic structure (e.g., Anosov 
maps, axiom A diffeomorphisms) can be modelled by subshifts of finite type over a finite 
alphabet. We already mentioned in Subsection 8.2 the typical example of a map of the
interval with an indifferent fixed point, whose symbolic model is still a subshift of finite 
type, but with a countable alphabet. The thermodynamic formalism for such systems 
is by now well developed [231 ESS EH EH [62] and used for multidimensional piecewise 
expanding maps [13] . An active line of research is related to systems admitting repre- 
sentations by symbolic models called "towers" constructed by using "inducing schemes" . 
The fundamental example is the class of one-dimensional unimodal maps satisfying the 
"Collet-Eckmann condition". A first attempt to develop thermodynamic formalism for 
such systems was made in [10] where existence and uniqueness of equilibrium measures 
for the potential function $\phi_\beta(u) = -\beta \log |T'(u)|$ with $\beta$ close to 1 was
established. Very recently, new developments in this direction appeared, see, e.g., [11, 12, 41].

A largely open field of research concerns a new branch of nonequilibrium statistical 
mechanics, the so-called "chaotic scattering theory", namely the analysis of chaotic 
systems with various openings or holes in phase space, and the corresponding repellers on 
which interesting invariant measures exist. We refer the reader to [U] for a brief account 
and references to the physics literature. The existence of (generalised) steady states on 
repellers and the so-called "escape rate formula" have been observed numerically in a 
number of models. So far, little has been proven mathematically, except for Anosov 
diffeomorphisms with special holes [15] and for certain nonuniformly hyperbolic systems 